Enhanced Server Fault-tolerance Techniques for Seamless User Experience

Author

  • Manish Marwah
Abstract

User applications such as email, calendar, and maps are migrating from local desktop machines to data centers because of the many advantages this computing paradigm offers. This trend is driving a marked increase in the number of servers deployed at data centers. To ride the price/performance curves for CPU, memory, and other hardware, inexpensive commodity machines are the most cost-effective choice for a data center, despite their low availability. However, more frequent server failures cause service outages and degrade user experience, which in turn results in lost revenue for businesses. Emerging web applications also put additional demands on server fault-tolerance. For example, if a user is browsing a map service such as Google, Yahoo, or MSN maps, a server failure that causes an outage of more than a few seconds is noticeable to the user and hence degrades user experience. In this thesis, I propose three novel techniques aimed at improving server fault-tolerance: (1) ST-TCP, an extension of TCP that tolerates server failures by using an active backup that replicates the state of the primary and seamlessly takes over a TCP connection when the primary server fails; (2) CRAFT, in which the TCP splicing mechanism is enhanced to make it both fault-tolerant and more scalable, and which then forms the basis of a scalable and fault-tolerant web server architecture that specifically addresses server fault-tolerance issues for highly interactive or real-time applications; and (3) call-preserving failover, an efficient and scalable fault-tolerance mechanism for migrating IP telephony calls to an alternate call controller.
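The active-backup idea behind technique (1) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the ST-TCP implementation: the names `ConnectionState`, `Server`, and `ReplicatedService` are hypothetical, there are no real sockets or TCP sequence numbers, and failure is modeled by flipping an `alive` flag rather than by failure detection.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionState:
    """Hypothetical per-connection state that a backup would have to mirror."""
    bytes_received: int = 0  # client bytes consumed so far
    bytes_sent: int = 0      # response bytes produced so far


@dataclass
class Server:
    name: str
    alive: bool = True
    state: ConnectionState = field(default_factory=ConnectionState)

    def handle(self, request: bytes) -> bytes:
        """Process one request and update the connection state."""
        response = request.upper()  # stand-in for real application logic
        self.state.bytes_received += len(request)
        self.state.bytes_sent += len(response)
        return response


class ReplicatedService:
    """A primary plus an active backup that processes every request too."""

    def __init__(self) -> None:
        self.primary = Server("primary")
        self.backup = Server("backup")

    def send(self, request: bytes) -> bytes:
        # The backup sees the same input, so its state tracks the primary's.
        backup_response = self.backup.handle(request)
        if self.primary.alive:
            # Normal operation: only the primary's output reaches the client.
            return self.primary.handle(request)
        # Primary has failed: the backup's output is delivered instead,
        # so the client-facing connection continues without a reset.
        return backup_response


if __name__ == "__main__":
    service = ReplicatedService()
    print(service.send(b"get /a"))
    service.primary.alive = False  # simulate a primary crash
    print(service.send(b"get /b"))
```

Because the backup has processed every request, it can answer the next one with up-to-date state as soon as the primary is marked failed; in this sketch the client-facing `send` call does not change across the failure.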


Similar resources

Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...


Enhanced Server Fault Tolerance for Improved User Experience; CU-CS-1037-08

Interactive applications such as email, calendar, and maps are migrating from local desktop machines to data centers due to the many advantages offered by such a computing environment. Furthermore, this trend is creating a marked increase in the deployment of servers at data centers. To ride the price/performance curves for CPU, memory and other hardware, inexpensive commodity machines are the ...


Client-Transparent Fault-Tolerant Web Service

Most of the existing fault tolerance schemes for Web servers detect server failure and route future client requests to backup servers. These techniques typically do not provide transparent handling of requests whose processing was in progress when the failure occurred. Thus, the system may fail to provide the user with confirmation for a requested transaction or clear indication that the transa...


Seamless Mobility with Personal Servers

We describe the concept and the taxonomy of personal servers, and their implications in seamless mobility. Personal servers could offer electronic services independently of network availability or quality, provide a greater flexibility in the choice of user access device, and support the key concept of continuous user experience. We describe the organization of mobile and remote personal server...


Fuxi: a Fault-Tolerant Resource Management and Job Scheduling System at Internet Scale

Scalability and fault-tolerance are two fundamental challenges for all distributed computing at Internet scale. Despite many recent advances from both academia and industry, these two problems are still far from settled. In this paper, we present Fuxi, a resource management and job scheduling system that is capable of handling the kind of workload at Alibaba where hundreds of terabytes of data ...




Publication date: 2007